The following coursebook is the main part for Online Data Science Series: Visualizing Geospatial Data in R workshop produced by the team at Algoritma . The coursebook is intended for a restricted audience only, i.e. the individuals having received this coursebook directly from the training organization. It may not be reproduced, distributed, translated or adapted in any form outside these individuals and organizations without permission.
Algoritma is a data science education center based in Jakarta. We organize workshops and training programs to help working professionals and students gain mastery in various data science sub-fields: data visualization, machine learning, data modeling, statistical inference, etc.
Before you go ahead and run the codes in this coursebook, it’s often a good idea to go through some initial setup. Under the Training Objectives section we’ll outline the syllabus, identify the key objectives and set up expectations for each module. Under the Libraries and Setup section you’ll see some code to initialize our workspace and the libraries we’ll be using for the projects. You may want to make sure that the libraries are installed beforehand by referring back to the packages listed here.
Geospatial data is data about objects, events, or phenomena that have a location on the surface of the earth. The data combines location information (usually coordinates on the earth), attribute information (the characteristics of the object, event, or phenomena concerned), and often also temporal information (the time or life span at which the location and attributes exist)1.
Geospatial analysis uses this data to build maps, graphs, statistics, and cartograms, making complex relationships understandable. It introduces a formal techniques using geographic and geometry properties of a data. There are various business problems that can be solved using geospatial analysis, namely a few:
There are actually a plethora of tools that can visualize geographic information from full-scale GIS (Geographic Information System) applications such as ArcGIS and QGIS to web-based tools like Google maps to any number of programing languages. But, with advantages and disadvantages to these different types of tools, using a command-line interface has a steep learning curve, but it has the benefit of enabling approaches to analysis and visualization that are customizable, transparent, and reproducible2.
This 4-days online workshop is a beginner-friendly introduction to Geospatial Analysis in R. By visualizing geospatial data, users can have more intuitive decision making by contextualizing data in the real world, comparing informations across the city, state, or country.
This is the first online data science series course of Geospatial Analysis in R. The primary objective of this course is to provide a participant a comprehensive introduction about tools and software for visualizing a geospatial data using the popular open-source tools: R. The material will covers:
Introductory Module:
tidyverse
ggplot2Main Module:
ggplot2leaflet - a JavaScript API for creating interactive mapsleafletrmarkdown versatile outputflexdashboard packageIn this Library and Setup section you’ll see some code to initialize our workspace, and the packages we’ll be using for this project.
Packages are collections of R functions, data, and compiled code in a well-defined format. The directory where packages are stored is called the library. R comes with a standard set of packages. Others are available for download and installation. Once installed, they have to be loaded into the session to be used.
You will need to use install.packages() to install any packages that are not yet downloaded onto your machine. To install packages, type the command below on your console then press ENTER.
## DO NOT RUN CHUNK
# packages <- c("tidyverse", "maps","rgdal","tmap","sf")
#
# install.packages(packages)Then you need to load the package into your workspace using the library() function. Special for this course, the rmarkdown packages do not need to be called using library().
Geospatial analysis, as the main topic of this workshop, is a domain of analysis that focuses on data processing that is associated with geographic data. With unlisted potentials for many business domain analysis, in this main section of the workshop, we will try to learn the building blocks of geospatial data and combines it with a business use case example in order to visualize our data as an insightful map.
We will explore how to use and obtain geospatial data in R to create some of the most popular types of thematic maps; choropleth (static and interactive) and interactive heatmap.
Let’s have a quick glimpse on a simple geospatial visualization in R by running the code chunk below:
The code above projected countries around the world as a static map. But, how does geographic data information stored? How could we obtain detail information on geographic contours or territorial borders?
Figure 2.1: Illustration on Vector vs. Raster
The two most widely used geographic data models for spatial data are vector and raster. A vector data represents location and shape of geographic features through geometric shapes such as points, lines and polygons. Raster data, on the other hand, consists of values within a grid system. You can imagine raster data as a pixelated digital image such as a satellite imagery or a scanned map.
There are lots of data formats that can be used to store each data model. KML, GeoJSON, GeoTIFF, Tab File, are some of the common formats you might ever came accross. However, the most common format in geographic information system mapping is the Shapefile.
Shapefile is the universal standard of geospatial format developed and regulated by Esri, the international supplier of geographic information system softwares. The type format is then adopted by so many programming languages, including R.
The name Shapefile might be a little deceptive since the file is made up of at least three separate files:
.shp includes the geometry data.shx includes the index data used to identify different geometries.dbf includes attribute information of each geometry’s data.For this workshop, I have provided a shapefile for Indonesian territory under data/gadm36_IDN_3 folder of this directory. Notice that inside the directory, there are reading there are additional files other than 3 mandatory files listed above:
.prj includes the information of the coordinate reference system.cpg includes encoding information for the variable attribute stored in .dbf filesThe shapefile itself is made available by GADM. If you go ahead to the website you can pick out different countries spatial vector. Indonesia’s spatial vector is provided up to 4 levels of granularity. In this course we’ll be working with 3 levels of spatial vector that contains: Provinsi, Kabupaten and Kecamatan.
Extra Tip:
You can also access GADM database directly from R with the help of GADMTools library:
# Code example to load Indonesia's spatial vector in Kabupaten/Kota level
library(GADMTools)
library(sf)
map <- gadm_sf_loadCountries(c("IDN"), level=2, basefile = "./")
plotmap(map) Let’s read in the files using st_read() function from sf package:
#> Reading layer `gadm36_IDN_3' from data source `/home/tanesya/Documents/Algo: 5. Others/DSS - Geospatial/gadm36_IDN_3' using driver `ESRI Shapefile'
#> Simple feature collection with 6696 features and 16 fields
#> geometry type: MULTIPOLYGON
#> dimension: XY
#> bbox: xmin: 95.0097 ymin: -11.00762 xmax: 141.0194 ymax: 6.076941
#> geographic CRS: WGS 84
sf represents “Simple Features” as records in a data.frame or tibble with a geometry list-column3. Simple Features is a hierarchical data model that represents a wide range of geometry types. Of 17 geometry types supported by the specification, only 7 are used in the vast majority of geographic research 4:
Figure 2.2: Source:Geocomputation with R
If you use the class() function for our newly created idn3 object you can retrieved how the sf class handles spatial data in similar way as any tabular data structure; by stores it as a dataframe:
#> [1] "sf" "data.frame"
The main difference between a regular dataframe an an sf object is that it has geometry which describes where on Earth the feature is located, and they have attributes, which describe other properties:
#> Geometry set for 6696 features
#> geometry type: MULTIPOLYGON
#> dimension: XY
#> bbox: xmin: 95.0097 ymin: -11.00762 xmax: 141.0194 ymax: 6.076941
#> geographic CRS: WGS 84
#> First 5 geometries:
To get a better idea about geometry information in an sf object, let’s go ahead and plot idn3$geometry using plot() function:
There are several informations that is stored in an sf’s geometry:
POINT, LINESTRING, POLYGON, MULTIPOINT, MULTILINESTRING, MULTIPOLYGON and GEOMETRYCOLLECTION.XY), 3-(XYZ/XYM) or 4-dimensional (XYZM) space.A CRS is a fundamental component of geospatial data. It models earth surface into a mathematical model. Intuitively, you could think of it as a way to model a 3 dimensional surfaces such as earth, into a 2 dimensional surface that is commonly used in geospatial analysis: making maps, distance calculation, etc. Take a look at the following images for better illustration:
Figure 2.3: Source:DataCarpentry
If we were using geospatial data from different sources, it is important to make sure the data we are using has the same CRS attribute. A different CRS would not represented in the same mathematical space if combined and would alter any calculation done on the data significantly.
The big advantage of using sf is how you each functions can be combined using %>% operator and works well with the tidyverse collection of R packages, which gives us better control over the geometry information in sf objects. For example, to subset certain provinces, you can use filter() method just like how you work with regular dataframe:
You can also perform aggregation operations over the geometry:
Dive Deeper
Using the same method as above, try to plot a map which shown Indonesian province of North Sumatra in the city (Kabupaten/Kota) level!
Other than sf’s simple features, there are actually other methodologies for storing model of geographical features into R. If you ever ran into a geospatial studies in R, you may also familiar with the use of sp package. In fact, sp was a very-well developed package since 2005 which practically supported almost every GIS analysis in R, even until now.
The main problem of sp is its low compatibility to R dataframe structure. sf was built to fill the gap. Released in 2016, sf uses OGC (Open Geospatial Consortium) & ISO standard on recording and structuring spatial data with simple features.
The disadvantage of sf is, since it is relatively new, some spatial packages may have not yet added support for sf object. Fortunately, we can stil convert an sf object to spatial class used in sp:
#> [1] "SpatialPolygonsDataFrame"
#> attr(,"package")
#> [1] "sp"
Spatial objects can be converted back to sf in the same way or with sf’s st_as_sf() function:
#> [1] "sf" "data.frame"
Notes:
If you want to learn more about sp and how to work with spatial data using sp ecosystem, you can refer to “sp-example.Rmd” in this directory.
Now let’s head back to our listings.csv, a data consists of properties listed for sale in Jabodetabek area:
#> id kota kecamatan harga KT KM m2
#> 1 1 Jakarta Pusat Gambir 3900000000 3 2 169
#> 2 2 Jakarta Pusat Kemayoran 1450000000 3 2 160
#> 3 3 Jakarta Pusat Gambir 11100000000 10 8 720
#> 4 4 Jakarta Pusat Kemayoran 485000000 2 1 35
#> 5 5 Jakarta Utara Tanjung Priok 5600000000 6 5 240
#> 6 6 Jakarta Barat Tamansari 3000000000 10 5 180
Here, we want to compare the house pricing for each sub-district (Kecamatan) level. Since the size of the listed properties may vary, we’ll use the price per square meter instead:
df_agg <- df %>%
mutate(
harga_m2 = harga / m2
) %>%
group_by(kota, kecamatan) %>%
summarise(harga_m2 = median(harga_m2),
total_listings= n()) %>%
ungroup()
head(df_agg)#> # A tibble: 6 x 4
#> kota kecamatan harga_m2 total_listings
#> <chr> <chr> <dbl> <int>
#> 1 Bekasi Tarumajaya 12281250 463
#> 2 Depok Beji 11746032. 84
#> 3 Depok Cimanggis 11679365. 110
#> 4 Depok Cinere 12761905. 88
#> 5 Jakarta Barat Cengkareng 13040000 455
#> 6 Jakarta Barat Grogolpetamburan 15032680. 223
To project our housing data into a map, we need to equip each observation in the data with geographical information representation. We can do this by joining the information of the city (kota) & sub-district (kecamatan) to NAME_2 & NAME_3 variables respectively in idn3:
#> GID_0 NAME_0 GID_1 NAME_1 NL_NAME_1 GID_2 NAME_2 NL_NAME_2
#> 1 IDN Indonesia IDN.1_1 Aceh <NA> IDN.1.2_1 Aceh Barat <NA>
#> 2 IDN Indonesia IDN.1_1 Aceh <NA> IDN.1.2_1 Aceh Barat <NA>
#> 3 IDN Indonesia IDN.1_1 Aceh <NA> IDN.1.2_1 Aceh Barat <NA>
#> 4 IDN Indonesia IDN.1_1 Aceh <NA> IDN.1.2_1 Aceh Barat <NA>
#> 5 IDN Indonesia IDN.1_1 Aceh <NA> IDN.1.2_1 Aceh Barat <NA>
#> 6 IDN Indonesia IDN.1_1 Aceh <NA> IDN.1.2_1 Aceh Barat <NA>
#> GID_3 NAME_3 VARNAME_3 NL_NAME_3 TYPE_3 ENGTYPE_3
#> 1 IDN.1.2.1_1 Arongan Lambalek <NA> <NA> Kecamatan Sub-district
#> 2 IDN.1.2.2_1 Bubon <NA> <NA> Kecamatan Sub-district
#> 3 IDN.1.2.3_1 Johan Pahlawan <NA> <NA> Kecamatan Sub-district
#> 4 IDN.1.2.4_1 Kaway Xvi <NA> <NA> Kecamatan Sub-district
#> 5 IDN.1.2.5_1 Meureubo <NA> <NA> Kecamatan Sub-district
#> 6 IDN.1.2.6_1 Pantai Ceuremen <NA> <NA> Kecamatan Sub-district
#> CC_3 HASC_3 geometry
#> 1 1107062 <NA> MULTIPOLYGON (((95.97953 4....
#> 2 1107061 <NA> MULTIPOLYGON (((96.16601 4....
#> 3 1107050 <NA> MULTIPOLYGON (((96.13205 4....
#> 4 1107080 <NA> MULTIPOLYGON (((96.16397 4....
#> 5 1107081 <NA> MULTIPOLYGON (((96.25119 4....
#> 6 1107082 <NA> MULTIPOLYGON (((96.18817 4....
To join two dataframes in R, we can use the *_join() function from dplyr package:
df_agg <- df_agg %>%
left_join(idn3, by = c("kota" = "NAME_2", "kecamatan" = "NAME_3"))
head(df_agg)#> # A tibble: 6 x 19
#> kota kecamatan harga_m2 total_listings GID_0 NAME_0 GID_1 NAME_1 NL_NAME_1
#> <chr> <chr> <dbl> <int> <chr> <chr> <chr> <chr> <chr>
#> 1 Beka… Tarumaja… 1.23e7 463 IDN Indon… IDN.… Jawa … <NA>
#> 2 Depok Beji 1.17e7 84 IDN Indon… IDN.… Jawa … <NA>
#> 3 Depok Cimanggis 1.17e7 110 IDN Indon… IDN.… Jawa … <NA>
#> 4 Depok Cinere 1.28e7 88 IDN Indon… IDN.… Jawa … <NA>
#> 5 Jaka… Cengkare… 1.30e7 455 IDN Indon… IDN.… Jakar… <NA>
#> 6 Jaka… Grogolpe… 1.50e7 223 IDN Indon… IDN.… Jakar… <NA>
#> # … with 10 more variables: GID_2 <chr>, NL_NAME_2 <chr>, GID_3 <chr>,
#> # VARNAME_3 <chr>, NL_NAME_3 <chr>, TYPE_3 <chr>, ENGTYPE_3 <chr>,
#> # CC_3 <chr>, HASC_3 <chr>, geometry <MULTIPOLYGON [°]>
Now that our df_agg data have the geographic attributes attached, we can turn it into an sf object using st_as_sf() function:
#> Geometry set for 59 features
#> geometry type: MULTIPOLYGON
#> dimension: XY
#> bbox: xmin: 106.6325 ymin: -6.39886 xmax: 107.0317 ymax: -6.0409
#> geographic CRS: WGS 84
#> First 5 geometries:
ggplot2R is loaded with built-in tools and open source packages to help us turning both sp and sf objects into a neat map visualization. In the first session of this workshop, you have learned about a widely used and powerful plotting library for R, ggplot2. The great thing is that ggplot can plot sf objects directly by using geom_sf.
Recall about the ggplot2 layering system and see the code below:
You can also easily map the color of
harga_m2 by adding the variable inside the aes() in the geom element:
> Dive Deeper
Recall how you can use labs() in ggplot to add label informations, such as title, subtitle, etc. Copy down the previous code above, and provide the map with appropriate labellings on the chunk below!
With small adjustments and feature additions, we have built a nice geographic representation of house pricing in Jakarta. Notice that in the chunk below, I added a layer of theme_algo_map(), which was a customized map I created for this workshop. (You can also create your own theme too!)
The modularity of ggplot2 also allows us to save it as an external file, makes it easy for us to make a reproducible custom theming. You can find the code for theme_algo_map() under assets/theme_algo.R.
source('assets/theme_algo.R')
plot <- ggplot(df_agg)+
geom_sf(aes(fill = harga_m2)) +
labs(title = "Distribusi Harga Rumah di Jabodetabek",
subtitle = "Hasil sample harga di marketplace properti",
caption = "Sample menggunakan ±10,000 listing rumah di laman jual-beli rumah OLX\n September 2020",
fill = "Harga/m2")+
scale_fill_gradient(low = "lavenderblush", high = "red3",
labels = number_format(scale = 1/1e6, suffix = " Jt.", accuracy = 1))+
theme_algo_map()
plot
Another benefit of creating maps with
ggplot2 is how we can easily add a level of interactivity by simply use ggplotly() function from the plotly package.
Since we have stored our most recent plot creation as an object named plot, let’s try to wrap it with ggplotly(plot).
(P.S. If you haven’t install the plotly package, you can run install.packages("plotly") through your console)
You can also add another adjustments such as customizing the tooltip and hide the mode bar for cleaner appearance:
# create new column to store the tooltip information
df_agg <- df_agg %>%
mutate(text = glue("{kecamatan}, {kota}: <br> {number(harga_m2, scale = 1/1e6, suffix = 'Jt.', accuracy = .01)}"))
# re-run the function to create plot
plot <- ggplot(df_agg, aes(text = text))+ # add new aes mapping: text to define the tooltip text
geom_sf(aes(fill = harga_m2)) +
labs(title = "Distribusi Harga Rumah di Jabodetabek",
subtitle = "Hasil sample harga di marketplace properti",
caption = "Sample menggunakan ±10,000 listing rumah di laman jual-beli rumah OLX\n September 2020",
fill = "Harga/m2")+
scale_fill_gradient(low = "lavenderblush", high = "red3",
labels = number_format(scale = 1/1e6, suffix = " Jt.", accuracy = 1))+
theme_algo_map()
# re-run the `ggplotly`
ggplotly(plot, tooltip = "text") %>% # define tooltip
config(displayModeBar = F) # hide the modebarDive Deeper
Using the same method as above, try to create a map that represent the number of listed properties(total_listings) in each sub-district! Once you’re done, you can also try to customize your own map appearance!
leafletAmong all available mapping packages in R, leaflet has became the most widely used for building interactive maps in R. It provides a relatively low-level interface to the Leaflet JavaScript library. The full documentation of the package can be accessed here.
Taken from its official documentation5, to create a Leaflet map we can follow these basic steps:
leaflet().addTiles, addMarkers, addPolygons) to modify the map widget.Now, let’s follow the steps above to recreate the previous choropleth as a Leaflet map.
# leaflet
leaflet(df_agg) %>% # create map widget
addTiles() %>% # add basemap
addPolygons() # add polygons from `sf` data %By default, addTiles() used tiles from OpenStreetMap. We can also use third-party basemaps by using addProviderTiles() instead. The lists of complete set basemaps provided by leaflet-providers can be found here.
#> [1] "OpenStreetMap"
#> [2] "OpenStreetMap.Mapnik"
#> [3] "OpenStreetMap.DE"
#> [4] "OpenStreetMap.CH"
#> [5] "OpenStreetMap.France"
#> [6] "OpenStreetMap.HOT"
#> [7] "OpenStreetMap.BZH"
#> [8] "OpenSeaMap"
#> [9] "OpenPtMap"
#> [10] "OpenTopoMap"
#> [11] "OpenRailwayMap"
#> [12] "OpenFireMap"
#> [13] "SafeCast"
#> [14] "Thunderforest"
#> [15] "Thunderforest.OpenCycleMap"
#> [16] "Thunderforest.Transport"
#> [17] "Thunderforest.TransportDark"
#> [18] "Thunderforest.SpinalMap"
#> [19] "Thunderforest.Landscape"
#> [20] "Thunderforest.Outdoors"
#> [21] "Thunderforest.Pioneer"
#> [22] "Thunderforest.MobileAtlas"
#> [23] "Thunderforest.Neighbourhood"
#> [24] "OpenMapSurfer"
#> [25] "OpenMapSurfer.Roads"
#> [26] "OpenMapSurfer.Hybrid"
#> [27] "OpenMapSurfer.AdminBounds"
#> [28] "OpenMapSurfer.ContourLines"
#> [29] "OpenMapSurfer.Hillshade"
#> [30] "OpenMapSurfer.ElementsAtRisk"
#> [31] "Hydda"
#> [32] "Hydda.Full"
#> [33] "Hydda.Base"
#> [34] "Hydda.RoadsAndLabels"
#> [35] "MapBox"
#> [36] "Stamen"
#> [37] "Stamen.Toner"
#> [38] "Stamen.TonerBackground"
#> [39] "Stamen.TonerHybrid"
#> [40] "Stamen.TonerLines"
#> [41] "Stamen.TonerLabels"
#> [42] "Stamen.TonerLite"
#> [43] "Stamen.Watercolor"
#> [44] "Stamen.Terrain"
#> [45] "Stamen.TerrainBackground"
#> [46] "Stamen.TerrainLabels"
#> [47] "Stamen.TopOSMRelief"
#> [48] "Stamen.TopOSMFeatures"
#> [49] "TomTom"
#> [50] "TomTom.Basic"
#> [51] "TomTom.Hybrid"
#> [52] "TomTom.Labels"
#> [53] "Esri"
#> [54] "Esri.WorldStreetMap"
#> [55] "Esri.DeLorme"
#> [56] "Esri.WorldTopoMap"
#> [57] "Esri.WorldImagery"
#> [58] "Esri.WorldTerrain"
#> [59] "Esri.WorldShadedRelief"
#> [60] "Esri.WorldPhysical"
#> [61] "Esri.OceanBasemap"
#> [62] "Esri.NatGeoWorldMap"
#> [63] "Esri.WorldGrayCanvas"
#> [64] "OpenWeatherMap"
#> [65] "OpenWeatherMap.Clouds"
#> [66] "OpenWeatherMap.CloudsClassic"
#> [67] "OpenWeatherMap.Precipitation"
#> [68] "OpenWeatherMap.PrecipitationClassic"
#> [69] "OpenWeatherMap.Rain"
#> [70] "OpenWeatherMap.RainClassic"
#> [71] "OpenWeatherMap.Pressure"
#> [72] "OpenWeatherMap.PressureContour"
#> [73] "OpenWeatherMap.Wind"
#> [74] "OpenWeatherMap.Temperature"
#> [75] "OpenWeatherMap.Snow"
#> [76] "HERE"
#> [77] "HERE.normalDay"
#> [78] "HERE.normalDayCustom"
#> [79] "HERE.normalDayGrey"
#> [80] "HERE.normalDayMobile"
#> [81] "HERE.normalDayGreyMobile"
#> [82] "HERE.normalDayTransit"
#> [83] "HERE.normalDayTransitMobile"
#> [84] "HERE.normalDayTraffic"
#> [85] "HERE.normalNight"
#> [86] "HERE.normalNightMobile"
#> [87] "HERE.normalNightGrey"
#> [88] "HERE.normalNightGreyMobile"
#> [89] "HERE.normalNightTransit"
#> [90] "HERE.normalNightTransitMobile"
#> [91] "HERE.reducedDay"
#> [92] "HERE.reducedNight"
#> [93] "HERE.basicMap"
#> [94] "HERE.mapLabels"
#> [95] "HERE.trafficFlow"
#> [96] "HERE.carnavDayGrey"
#> [97] "HERE.hybridDay"
#> [98] "HERE.hybridDayMobile"
#> [99] "HERE.hybridDayTransit"
#> [100] "HERE.hybridDayGrey"
#> [101] "HERE.hybridDayTraffic"
#> [102] "HERE.pedestrianDay"
#> [103] "HERE.pedestrianNight"
#> [104] "HERE.satelliteDay"
#> [105] "HERE.terrainDay"
#> [106] "HERE.terrainDayMobile"
#> [107] "FreeMapSK"
#> [108] "MtbMap"
#> [109] "CartoDB"
#> [110] "CartoDB.Positron"
#> [111] "CartoDB.PositronNoLabels"
#> [112] "CartoDB.PositronOnlyLabels"
#> [113] "CartoDB.DarkMatter"
#> [114] "CartoDB.DarkMatterNoLabels"
#> [115] "CartoDB.DarkMatterOnlyLabels"
#> [116] "CartoDB.Voyager"
#> [117] "CartoDB.VoyagerNoLabels"
#> [118] "CartoDB.VoyagerOnlyLabels"
#> [119] "CartoDB.VoyagerLabelsUnder"
#> [120] "HikeBike"
#> [121] "HikeBike.HikeBike"
#> [122] "HikeBike.HillShading"
#> [123] "BasemapAT"
#> [124] "BasemapAT.basemap"
#> [125] "BasemapAT.grau"
#> [126] "BasemapAT.overlay"
#> [127] "BasemapAT.highdpi"
#> [128] "BasemapAT.orthofoto"
#> [129] "nlmaps"
#> [130] "nlmaps.standaard"
#> [131] "nlmaps.pastel"
#> [132] "nlmaps.grijs"
#> [133] "nlmaps.luchtfoto"
#> [134] "NASAGIBS"
#> [135] "NASAGIBS.ModisTerraTrueColorCR"
#> [136] "NASAGIBS.ModisTerraBands367CR"
#> [137] "NASAGIBS.ViirsEarthAtNight2012"
#> [138] "NASAGIBS.ModisTerraLSTDay"
#> [139] "NASAGIBS.ModisTerraSnowCover"
#> [140] "NASAGIBS.ModisTerraAOD"
#> [141] "NASAGIBS.ModisTerraChlorophyll"
#> [142] "NLS"
#> [143] "JusticeMap"
#> [144] "JusticeMap.income"
#> [145] "JusticeMap.americanIndian"
#> [146] "JusticeMap.asian"
#> [147] "JusticeMap.black"
#> [148] "JusticeMap.hispanic"
#> [149] "JusticeMap.multi"
#> [150] "JusticeMap.nonWhite"
#> [151] "JusticeMap.white"
#> [152] "JusticeMap.plurality"
#> [153] "Wikimedia"
#> [154] "GeoportailFrance"
#> [155] "GeoportailFrance.parcels"
#> [156] "GeoportailFrance.ignMaps"
#> [157] "GeoportailFrance.maps"
#> [158] "GeoportailFrance.orthos"
#> [159] "OneMapSG"
#> [160] "OneMapSG.Default"
#> [161] "OneMapSG.Night"
#> [162] "OneMapSG.Original"
#> [163] "OneMapSG.Grey"
#> [164] "OneMapSG.LandLot"
To access the supported tile providers, we can also useproviders$ and choose from the options:
Let’s now color the polygons based on the house pricing. First, we need to create a color palette to represent the data. In the chunk below, we create an object called pal which stores the colors generated from df_agg$harga_m2 values. The value passed to fill in palette is provided by ColorBrewer2 palettes.
Notice that on the next line, inside the addPolygons() function call, we also add some parameters:
fillColor: the values and palette to be mappedfillOpacity: the fillColor opacityweight: thickness of the mapped polygonscolor: color of the mapped polygonspal <- colorNumeric(palette = "Reds", domain = df_agg$harga_m2)
leaflet(df_agg) %>%
addProviderTiles(providers$Esri.NatGeoWorldMap) %>%
addPolygons(
fillColor = ~pal(harga_m2),
fillOpacity = .8,
weight = 2,
color = "white"
)To add more interaction to our map, we can also make the polygons highlight as the mouse passes over them. We can do it by defining the highlight argument inside the addPolygons() function:
leaflet(df_agg) %>%
addProviderTiles(providers$Esri.NatGeoWorldMap) %>%
addPolygons(
fillColor = ~pal(harga_m2),
fillOpacity = .8,
weight = 2,
color = "white",
highlight = highlightOptions(
weight = 5,
color = "black",
bringToFront = TRUE,
opacity = 0.8
)
)Now let’s also add labels information as we hover over each sub-district mapped. To add the label, we need to create a HTML object using htmltools::HTML which stores the value of information we desired to be shown:
labels <- glue::glue("
<b>{df_agg$kecamatan}, {df_agg$kota}</b>:<br> {round(df_agg$harga_m2/1e6, 2)} jt/m2"
) %>% lapply(htmltools::HTML)Then, to map it, we can add label argument inside the addPolygons() function.
leaflet(df_agg) %>%
addProviderTiles(providers$Esri.NatGeoWorldMap) %>%
addPolygons(
label = labels,
fillColor = ~pal(harga_m2),
fillOpacity = .8,
weight = 2,
color = "white",
highlight = highlightOptions(
weight = 5,
color = "black",
bringToFront = TRUE,
opacity = 0.8
)
)As the final step, we might also need to add a legend to give information about the colors and intervals. To add the legend, we only a layer of addLegend() function:
leaflet(df_agg) %>%
addProviderTiles(providers$Esri.NatGeoWorldMap) %>%
addPolygons(
label = labels,
fillColor = ~pal(harga_m2),
fillOpacity = .8,
weight = 2,
color = "white",
highlight = highlightOptions(
weight = 5,
color = "black",
bringToFront = TRUE,
opacity = 0.8
)
) %>%
addLegend(
pal = pal,
values = ~harga_m2,
opacity = 1,
title = "Harga/m2",
position = "bottomright"
)Dive Deeper
______) below to get the geometry for our Jakarta border!# Copy down this code to the chunk below:
border <- df_agg %>%
filter( ______== ______) %>%
group_by(______) %>%
______() Complete Code:
pal <- colorNumeric(palette = "Reds", domain = df_agg$harga_m2)
labels <- glue::glue("
<b>{df_agg$kecamatan}, {df_agg$kota}</b>:<br> {round(df_agg$harga_m2/1e6, 2)} jt/m2"
) %>% lapply(htmltools::HTML)
border <- df_agg %>%
filter(NAME_1 == "Jakarta Raya") %>%
group_by(NAME_1) %>%
summarise()
leaflet(df_agg) %>%
addProviderTiles(providers$Esri.NatGeoWorldMap) %>% # using `addProviderTiles()` instead of `addTiles()`
addPolygons(
label = labels,
labelOptions = labelOptions(
style = list(
"font-size"="13px",
"background-color"="black",
"color"="white"
)
),
weight = 2,
color = "white",
fillOpacity = .8,
fillColor = ~pal(harga_m2),
highlight = highlightOptions(
weight = 5,
color = "black",
bringToFront = TRUE,
sendToBack = TRUE,
opacity = 0.8)
) %>%
addPolylines(
data = border,
color = "darkred",
opacity = .8,
weight = 2.5
) %>%
addLegend(
pal = pal,
values = ~harga_m2,
opacity = 1,
title = "Harga/m2",
position = "bottomright"
) %>%
fitBounds(106.686211, -6.370783, 106.972824, -6.089036)In general, spatial data can be represented either as reference maps or thematic maps. While reference maps emphasizes the location of objects in the world, thematic maps shows the spatial variability of a specific distribution. The house pricing distribution map we created earlier is among the most frequently used thematic map types in geospatial data, which is the Choropleth map.
Figure 4.1: Source: Mapping Ideas from Cyberspace to Realspace
As an example, let’s take a look at this dataset of Perumahan (housing estate) locations in Jakarta area:
#> perumahan kota latitude longitude
#> 1 Bintaro Jaya Sektor 2 Jakarta Selatan -6.278108 106.7522
#> 2 Buaran Regency Jakarta Timur -6.232117 106.9251
#> 3 Casa Permata Hijau Jakarta Selatan -6.221427 106.7832
#> 4 Cempaka Residence Jakarta Selatan -6.291828 106.7779
#> 5 Green Lake City Jakarta Barat -6.177477 106.7113
#> 6 Green Lontar Jakarta Selatan -6.364691 106.7983
Let’s try to create a dot density map using leaflet’s addCircles() function:
Another common way of representing geographical density map is by using a heatmap. To create a heatmap in leaflet, we can use leaflet.extras package:
Geospatial mapping is an active area in R community, which makes R loaded with many other packages that supported spatial visualization. Packages like cartography, mapview, ggspatial, rasterVis and many other libraries that you can explore. Other than ggplot2 and leaflet that we used in this workshop, tmap is also one of the most popular package used to create both static and interactive maps in R.
tmap also works similarly with ggplot2, since it is also based on the idea of Grammar of Graphics6:
Just like how you can use facets in
ggplot2, you can also create it with tm_facets(), for example, to split our map based on each city:
tmap also has a nice feature that allows us to add interactivity by switching from “plot” to “view” mode like so:
Kristin Stock, Hans Guesgen, in Automating Open Source Intelligence, 2016↩︎
Edzer Pebesma, Daniel Nüst, and Roger Bivand, “The R Software Environment in Reproducible Geoscientific Research,” Eos 93 (2012): 163–164.↩︎
Robin Lovelace, Jakub Nowosad, Jannes Muenchow, “Geocomputation with R”↩︎
Wilkinson, Leland, and Graham Wills. 2005. The Grammar of Graphics. Springer Science+ Business Media.↩︎